Abstract

By exploring the relationship between parsing and
deduction, a new and more general view of chart parsing
is obtained, which encompasses parsing for grammar
formalisms based on unification, and is the basis of the
Earley Deduction proof procedure for definite clauses.
The efficiency of this approach for an interesting class of
grammars is discussed.

1.  Introduction 

The aim of this paper is to explore the relationship
between parsing and deduction. The basic notion, which
goes back to Kowalski (Kowalski, 1980) and Colmerauer
(Colmerauer, 1978), has seen a very efficient, if limited,
realization in the use of the logic programming language
Prolog for parsing (Colmerauer, 1978; Pereira and
Warren, 1980). The connection between parsing and
deduction was developed further in the design of the
Earley Deduction proof procedure (Warren, 1975), which
will also be discussed at length here.

Investigation  of  the  connection  between  parsing  and 
deduction  yields  several  important  benefits: 

•  A theoretically clean mechanism to connect parsing
with the inference needed for semantic
interpretation.

•  Handling of gaps and unbounded dependencies “on
the fly” without adding special mechanisms to the
parser.

•  A reinterpretation and generalization of chart
parsing that abstracts from unessential data-structure
details.

•  Techniques that are applicable to parsing in related
formalisms not directly based on logic.

•  Elucidation of parsing complexity issues for related
formalisms, in particular lexical-functional grammar
(LFG).

Our study of these topics is still far from complete;
therefore, besides offering some initial results, we shall
discuss various outstanding questions.

The connection between parsing and deduction is based
on the axiomatization of context-free grammars in
definite clauses, a particularly simple subset of first-order
logic (Kowalski, 1980; van Emden and Kowalski,
1976). This axiomatization allows us to identify
context-free parsing algorithms with proof procedures for a
restricted class of definite clauses, those derived from
context-free rules. This identification can then be
generalized to include larger classes of definite clauses to
which the same algorithms can be applied, with simple
modifications. Those larger classes of definite clauses can
be seen as grammar formalisms in which the atomic
grammar symbols of context-free grammars have been
replaced by complex symbols that are matched by
unification (Robinson, 1965; Colmerauer, 1978; Pereira
and Warren, 1980). The simplest of these formalisms is
definite-clause grammars (DCG) (Pereira and Warren,
1980).

There is a close relationship between DCGs and other
grammar formalisms based on unification, such as
Unification Grammar (UG) (Kay, 1979), LFG, PATR-2
(Shieber, 1983) and the more recent versions of GPSG
(Gazdar and Pullum, 1982).

The parsing algorithms we are concerned with are
online algorithms, in the sense that they apply the
constraints specified by the augmentation of a rule as
soon as the rule is applied. In contrast, an offline parsing
algorithm will consist of two phases: a context-free
parsing algorithm followed by application of the
constraints to all the resulting analyses.

The paper is organized as follows. Section 2 gives an
overview of the concepts of definite clause logic, definite
clause grammars, definite clause proof procedures, and
chart parsing. Section 3 discusses the connection between
DCGs and LFG. Section 4 describes the Earley
Deduction definite-clause proof procedure. Section 5 then
brings out the connection between Earley Deduction and
chart parsing, and shows the added generality brought in
by the proof procedure approach. Section 6 outlines some
of the problems of implementing Earley Deduction and
similar parsing procedures. Finally, Section 7 discusses
questions of computational complexity and decidability.


2.  Basic  Notions 

2.1.  Definite  Clauses 
A definite clause has the form

P <= Q_1 & ... & Q_n.

to be read as “P is true if Q_1 and ... and Q_n are true”. If
n = 0, the clause is a unit clause and is written simply as

P.

P and Q_1, ..., Q_n are literals. P is the positive literal
or head of the clause; Q_1, ..., Q_n are the negative
literals, forming the body of the clause. Literals have the
form p(t_1, ..., t_k), where p is the predicate of arity k and
the t_i are the arguments. The arguments are terms. A
term may be: a variable (variable names start with
capital letters); a constant; or a compound term
f(t_1, ..., t_m), where f is a functor of arity m and the t_i are
terms. All the variables in a clause are implicitly
universally quantified.

A set of definite clauses forms a program, and the
clauses in a program are called input clauses. A
program defines the relations denoted by the predicates
appearing in the heads of clauses. When using a
definite-clause proof procedure, such as Prolog (Roussel, 1975), a
goal statement

<= P.

requests the proof procedure to find provable instances of P.
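As an illustration only, the vocabulary of this section can be written down as a small data structure. The sketch below is in Python rather than in the notation of the paper, and the names `Var`, `Struct`, `Clause` and `show` are hypothetical helpers introduced here, not part of any formalism discussed in the text. A constant is modeled as a `Struct` with no arguments.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str  # by convention, variable names start with a capital letter


@dataclass(frozen=True)
class Struct:
    functor: str  # predicate or functor symbol; a constant has no arguments
    args: Tuple["Term", ...] = ()


Term = Union[Var, Struct]


@dataclass(frozen=True)
class Clause:
    head: Struct                   # the positive literal
    body: Tuple[Struct, ...] = ()  # the negative literals

    def is_unit(self) -> bool:
        return len(self.body) == 0


def show(t) -> str:
    """Render a term or clause in the 'P <= Q_1 & ... & Q_n.' notation."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Struct):
        if not t.args:
            return t.functor
        return f"{t.functor}({','.join(show(a) for a in t.args)})"
    if t.is_unit():
        return show(t.head) + "."
    return show(t.head) + " <= " + " & ".join(show(b) for b in t.body) + "."


rule = Clause(
    Struct("s", (Var("S0"), Var("S2"))),
    (Struct("np", (Var("S0"), Var("S1"))), Struct("vp", (Var("S1"), Var("S2")))),
)
print(show(rule))
```

Under this representation a goal statement is simply a clause whose head is an answer literal chosen by the user.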


2.2.  Definite  Clause  Grammars 

Any context-free rule

X → α_1 ... α_n

can be translated into a definite clause

x(S_0,S_n) <= α_1(S_0,S_1) & ... & α_n(S_{n-1},S_n).

The variables S_i are the string arguments, representing
positions in the input string. For example, the context-free
rule “S → NP VP” is translated into “s(S0,S2) <=
np(S0,S1) & vp(S1,S2),” which can be paraphrased as
“there is an S from S0 to S2 in the input string if there is
an NP from S0 to S1 and a VP from S1 to S2.”
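The translation just described is mechanical, and a minimal sketch may make the role of the string arguments concrete. The function name `rule_to_clause` is hypothetical, and the treatment of an empty right-hand side (a single shared string variable) is an assumption consistent with the general scheme rather than something stated at this point in the text.

```python
def rule_to_clause(lhs, rhs):
    """Translate the rule lhs -> rhs[0] ... rhs[n-1] into definite-clause text."""
    n = len(rhs)
    if n == 0:
        # An empty right-hand side spans no input, so both endpoints coincide.
        return f"{lhs.lower()}(S,S)."
    # The i-th symbol spans the input from position S_i to position S_{i+1}.
    body = " & ".join(
        f"{sym.lower()}(S{i},S{i + 1})" for i, sym in enumerate(rhs)
    )
    return f"{lhs.lower()}(S0,S{n}) <= {body}."


print(rule_to_clause("S", ["NP", "VP"]))
```

Each right-hand-side symbol shares its right endpoint with the left endpoint of the next symbol, which is what forces the phrases to be adjacent in the input.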


Given the translation of a context-free grammar G with
start symbol S into a set of definite clauses G' with
corresponding predicate s, to say that a string w is in the
grammar’s language is equivalent to saying that the start
goal s(p_0,p) is a consequence of G' ∪ W, where p_0 and p
represent the left and right endpoints of w, and W is a set
of unit clauses that represents w.

The next step is to generalize the above notions to define
DCGs. DCG nonterminals have arguments in the same
way that predicates do. A DCG nonterminal with n
arguments is translated into a predicate of n+2
arguments, the last two of which are the string points, as
in the translation of context-free rules into definite
clauses. The context-free grammar obtained from a DCG
by dropping all nonterminal arguments is the context-free
skeleton of the DCG.

2.3.  Deduction  in  Definite  Clauses

The fundamental inference rule for definite clauses is
the following resolution rule: From the clauses

B <= A_1 & ... & A_m.    (1)

C <= D_1 & ... & D_i & ... & D_n.    (2)

when B and D_i are unifiable by substitution σ, infer

σ[C <= D_1 & ... & D_{i-1} & A_1 & ... & A_m & D_{i+1} & ... & D_n.]    (3)

Clause (3) is a derived clause, the resolvent of (1) and
(2).
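A single application of the resolution rule can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: terms are modeled as nested tuples `(functor, arg, ...)`, variables as strings starting with a capital letter, and the helper names (`is_var`, `walk`, `unify`, `subst`, `resolve`) are hypothetical. The two clauses are assumed to share no variables, and the occurs check is omitted for brevity.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    # Follow variable bindings in substitution s until reaching a non-bound term.
    while is_var(t) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    """Return a substitution extending s that unifies a and b, or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


def subst(t, s):
    # Apply substitution s throughout term t.
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(x, s) for x in t[1:])
    return t


def resolve(clause1, clause2, i):
    """Resolve clause (1) = (B, [A_1..A_m]) on literal D_i of clause (2) = (C, [D_1..D_n])."""
    b, a_body = clause1
    c, d_body = clause2
    s = unify(b, d_body[i], {})
    if s is None:
        return None
    new_body = d_body[:i] + a_body + d_body[i + 1:]
    return subst(c, s), [subst(lit, s) for lit in new_body]


goal = (("ans", "Z"), [("c", 1, "Z")])
print(resolve((("c", 1, 2), []), goal, 0))
```

Resolving with a unit clause shortens the body of clause (2), while resolving with a nonunit clause replaces the selected literal by the body of clause (1).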

The proof procedure of Prolog is just a particular
embedding of the resolution rule in a search procedure, in
which a goal clause like (2) is successively rewritten by
the resolution rule using clauses from the program (1).
The Prolog proof procedure can be implemented very
efficiently, but it has the same theoretical problems as the
top-down backtrack parsing algorithms after which it is
modeled. These problems do not preclude its use for
creating uniquely efficient parsers for suitably constructed
grammars (Warren and Pereira, 1983; Pereira, 1982), but
the broader questions of the relation between parsing and
deduction and of the derivation of online parsing
algorithms for unification formalisms require that we look
at a more generally applicable class of proof procedures.

2.4.  Chart  Parsing  and  the  Earley  Algorithm 

Chart  parsing  is  a  general  framework  for  constructing 
parsing  algorithms  for  context-free  grammars  and  related 
formalisms.  The  Earley  context-free  parsing  algorithm, 
although  independently  developed,  can  be  seen  as  a 
particular  case  of  chart  parsing.  We  will  give  here  just 
the  basic  terminology  of  chart  parsing  and  of  the  Earley 
algorithm.  Full  accounts  can  be  found  in  the  articles  by 
Kay  (Kay,  1980)  and  Earley  (Earley,  1970). 

The state of a chart parser is represented by the chart,
which is a directed graph. The nodes of the chart
represent positions in the string being analyzed. Each
edge in the chart is either active or passive. Both types
of edges are labeled. A passive edge with label N links
node r to node s if the string between r and s has been
analyzed as a phrase of type N. Initially, the only edges
are passive edges that link consecutive nodes and are
labeled with the words of the input string (see Figure 1).
Active edges represent partially applied grammar rules.
In the simplest case, active edges are labeled by dotted
rules. A dotted rule is a grammar rule with a dot inserted
somewhere on its right-hand side

X → α_1 ... α_{i-1} • α_i ... α_n    (4)

An edge with this label links node r to node s if the
sentential form α_1 ... α_{i-1} is an analysis of the input string
between r and s. An active edge that links a node to
itself is called empty and acts like a top-down prediction.
Chart-parsing procedures start with a chart containing
the passive edges for the input string. New edges are
added in two distinct ways. First, an active edge from r to
s labeled with a dotted rule (4) combines with a passive
edge from s to t with label α_i to produce a new edge from
r to t, which will be a passive edge with label X if α_i is
the last symbol in the right-hand side of the dotted rule;
otherwise it will be an active edge with the dot advanced
over α_i. Second, the parsing strategy must place into the
chart, at appropriate points, new empty active edges that
will be used to combine existing passive edges. The exact
method used determines whether the parsing method is
seen as top-down, bottom-up, or a combination of the
two.
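The first of the two ways of adding edges can be sketched as a small function. The representation is an assumption made for illustration: an `Edge` records its endpoints, the rule it is applying, and the number of right-hand-side symbols already analyzed, and an edge is treated as passive when the dot has reached the end of the right-hand side. Lexical passive edges are modeled with an empty right-hand side.

```python
from typing import NamedTuple, Tuple


class Edge(NamedTuple):
    start: int
    end: int
    lhs: str
    rhs: Tuple[str, ...]
    dot: int  # number of right-hand-side symbols already analyzed

    def is_passive(self):
        return self.dot == len(self.rhs)


def combine(active, passive):
    """Combine an active edge r..s with a passive edge s..t, or return None."""
    if active.is_passive() or not passive.is_passive():
        return None
    # The passive edge must start where the active edge ends and must carry
    # the label that follows the dot in the active edge.
    if active.end != passive.start or active.rhs[active.dot] != passive.lhs:
        return None
    return Edge(active.start, passive.end, active.lhs, active.rhs, active.dot + 1)


prediction = Edge(0, 0, "NP", ("Det", "N"), 0)  # empty active edge
det = Edge(0, 2, "Det", (), 0)                  # passive edge: a Det from 0 to 2
noun = Edge(2, 3, "N", (), 0)                   # passive edge: an N from 2 to 3
partial = combine(prediction, det)
print(partial, combine(partial, noun))
```

The result is active or passive depending only on whether the dot has reached the end of the rule, so the same function covers both outcomes described in the text.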

The Earley parsing algorithm can be seen as a special
case of chart parsing in which new empty active edges are
introduced top-down and, for all k, the edge combinations
involving only the first k nodes are done before any
combinations that involve later nodes. This particular
strategy allows certain simplifications to be made in the
general algorithm.

3.  DCGs  and  LFG

We would like to make a few informal observations at
this point to clarify the relationship between DCGs and
other unification grammar formalisms, LFG in
particular. A more detailed discussion would take us
beyond the intended scope of this paper.

The different notational conventions of DCGs and LFG
make the two formalisms less similar on the surface than
they actually are from the computational point of view.
The objects that appear as arguments in DCG rules are
tree fragments every node of which has a number of
children predetermined by the functor that labels the
node. Explicit variables mark unspecified parts of the
tree. In contrast, the functional structure nodes that are
implicitly mentioned in LFG equations do not have a
predefined number of children, and unspecified parts are
either omitted or defined implicitly through equations.

As a first approximation, a DCG rule such as

s(s(Subj,Obj)) → np(Subj) vp(Obj)    (5)

might correspond to the LFG rule

S →      NP            VP            (6)
     ↑ subj = ↓    ↑ obj = ↓

The DCG rule can be read as “an s with structure

       s
      / \
   Subj  Obj

is an np with structure Subj followed by a vp with
structure Obj.” The LFG rule can be read as “an S is an
NP followed by a VP, where the value of the subj
attribute of the S is the functional structure of the NP
and the value of the attribute obj of the S is the
functional structure of the VP.” For those familiar with
the details of the mapping from functional descriptions to
functional structures in LFG, DCG variables are just
“placeholder” symbols (Bresnan and Kaplan, 1982).

As we noted above, an apparent difference between
LFG and DCGs is that LFG functional structure nodes,
unlike DCG function symbols, do not have a definite
number of children. Although we must leave to a
separate paper the details of the application to LFG of
the unification algorithms from theorem proving, we will
note here that the formal properties of logical and LFG or
UG unification are similar, and there are adaptations to
LFG and UG of the algorithms and data structures used
in the logical case.

4.  Earley  Deduction 

The Earley Deduction proof procedure schema is named
after Earley's context-free parsing algorithm (Earley,
1970), on which it is based. Earley Deduction provides
for definite clauses the same kind of mixed top-down
bottom-up mechanism that the Earley parsing algorithm
provides for context-free grammars.

Earley Deduction operates on two sets of definite clauses
called the program and the state. The program is just
the set of input clauses and remains fixed. The state
consists of a set of derived clauses, where each nonunit
clause has one of its negative literals selected; the state is
continually being added to. Whenever a nonunit clause is
added to the state, one of its negative literals is selected.
Initially the state contains just the goal statement (with
one of its negative literals selected).

There are two inference rules, called instantiation and
reduction, which can map the current state into a new
one by adding a new derived clause. For an instantiation
step, there is some clause in the current state whose
selected literal unifies with the positive literal of a
nonunit clause C in the program. In this case, the
derived clause is σ[C], where σ is a most general unifier
(Robinson, 1965) of the two literals concerned. The
selected literal is said to instantiate C to σ[C].

For a reduction step, there is some clause C in the
current state whose selected literal unifies with a unit
clause from either the program or the current state. In
this case, the derived clause is σ[C'], where σ is a most
general unifier of the two literals concerned, and C' is C
minus its selected literal. Thus, the derived clause is just
the resolvent of C with the unit clause and the latter is
said to reduce C to σ[C'].

Before a derived clause is added to the state, a check is
made to see whether the derived clause is subsumed by
any clause already in the state. If the derived clause is
subsumed, it is not added to the state, and that inference
step is said to be blocked.

In the examples that follow, we assume that the selected
literal in a derived clause is always the leftmost literal in
the body. This choice is not optimal (Kowalski, 1980),
but it is sufficient for our purposes.

For example, given the program

c(X,Z) <= c(X,Y) & c(Y,Z).    (7)
c(1,2).                       (8)
c(2,3).                       (9)

and goal statement

ans(Z) <= c(1,Z).             (10)

here is a sequence of clauses derived by Earley Deduction:

ans(Z) <= c(1,Z).             goal statement           (11)
c(1,Z) <= c(1,Y) & c(Y,Z).    (11) instantiates (7)    (12)
ans(2).                       (8) reduces (11)         (13)
c(1,Z) <= c(2,Z).             (8) reduces (12)         (14)
c(2,Z) <= c(2,Y) & c(Y,Z).    (14) instantiates (7)    (15)
c(1,3).                       (9) reduces (14)         (16)
ans(3).                       (16) reduces (11)        (17)
c(2,Z) <= c(3,Z).             (9) reduces (15)         (18)
c(3,Z) <= c(3,Y) & c(Y,Z).    (18) instantiates (7)    (19)

At this point, all further steps are blocked, so
computation terminates.

Earley Deduction generalizes Earley parsing in a direct
and natural way. Instantiation is analogous to the
“predictor” operation of Earley’s algorithm, while
reduction corresponds to the “scanner” and “completer”
operations. The “scanner” operation amounts to
reduction with an input unit clause representing a
terminal symbol occurrence, while the “completer”
operation amounts to reduction with a derived unit clause
representing a nonterminal symbol occurrence.

5.  Chart  Parsing  and  Earley  Deduction

Chart parsing (Kay, 1980) and other tabular parsing
algorithms (Aho and Ullman, 1972; Graham et al., 1980)
are usually presented in terms of certain (abstract) data
structures that keep a record of the alternatives being
explored by the parser. Looking at parsing procedures as
proof procedures has the following advantages: (i)
unification, gaps and unbounded dependencies are
automatically handled; (ii) parsing strategies become
possible that cannot be formulated in chart parsing.

The chart represents completed nonterminals (passive
edges) and partially applied rules (active edges). From the
standpoint of Earley Deduction, both represent derived
clauses that have been proved in the course of an attempt
to deduce a goal statement whose meaning is that a string
belongs to the language generated by the grammar. An
active edge corresponds to a nonunit clause, a passive
edge to a unit clause. Nowhere in this definition is there
mention of the “endpoints” of the edges. The endpoints
correspond to certain literal arguments, and are of no
concern to the (abstract) proof procedure. Endpoints are
just a convenient way of indexing derived clauses in an
implementation to reduce the number of nonproductive
(nonunifying) attempts at applying the reduction rule.

We shall give now an example of the application of
Earley Deduction to parsing, corresponding to the chart
of Figure 1.

The CFG

S → NP VP
NP → Det N
Det → NP Gen
Det → Art
Det → Λ
VP → V NP

corresponds to the following definite-clause program:

s(S0,S) <= np(S0,S1) & vp(S1,S).      (20)
np(S0,S) <= det(S0,S1) & n(S1,S).     (21)
det(S0,S) <= np(S0,S1) & gen(S1,S).   (22)
det(S0,S) <= art(S0,S).               (23)
det(S,S).                             (24)
vp(S0,S) <= v(S0,S1) & np(S1,S).      (25)

The lexical categories of the sentence

0 Agatha 1 's 2 husband 3 hit 4 Ulrich 5    (26)

can be represented by the unit clauses

n(0,1).      (27)
gen(1,2).    (28)
n(2,3).      (29)
v(3,4).      (30)
n(4,5).      (31)

Thus, the task of determining whether (26) is a sentence
can be represented by the goal statement

ans <= s(0,5).    (32)

If the sentence is in the language, the unit clause ans will
be derived in the course of an Earley Deduction proof.
Such a proof could proceed as follows:


ans <= s(0,5).                       goal statement           (33)
s(0,5) <= np(0,S1) & vp(S1,5).       (33) instantiates (20)   (34)
np(0,S) <= det(0,S1) & n(S1,S).      (34) instantiates (21)   (35)
det(0,S) <= np(0,S1) & gen(S1,S).    (35) instantiates (22)   (36)
det(0,S) <= art(0,S).                (35) instantiates (23)   (37)
np(0,S) <= n(0,S).                   (24) reduces (35)        (38)
np(0,1).                             (27) reduces (38)        (39)
s(0,5) <= vp(1,5).                   (39) reduces (34)        (40)
vp(1,5) <= v(1,S1) & np(S1,5).       (40) instantiates (25)   (41)
det(0,S) <= gen(1,S).                (39) reduces (36)        (42)
det(0,2).                            (28) reduces (42)        (43)
np(0,S) <= n(2,S).                   (43) reduces (35)        (44)
np(0,3).                             (29) reduces (44)        (45)
s(0,5) <= vp(3,5).                   (45) reduces (34)        (46)
det(0,S) <= gen(3,S).                (45) reduces (36)        (47)
vp(3,5) <= v(3,S1) & np(S1,5).       (46) instantiates (25)   (48)
vp(3,5) <= np(4,5).                  (30) reduces (48)        (49)
np(4,5) <= det(4,S1) & n(S1,5).      (49) instantiates (21)   (50)
det(4,S) <= np(4,S1) & gen(S1,S).    (50) instantiates (22)   (51)
det(4,S) <= art(4,S).                (50) instantiates (23)   (52)
np(4,S) <= det(4,S1) & n(S1,S).      (51) instantiates (21)   (53)
np(4,5) <= n(4,5).                   (24) reduces (50)        (54)
np(4,S) <= n(4,S).                   (24) reduces (53)        (55)
np(4,5).                             (31) reduces (54)        (56)
vp(3,5).                             (56) reduces (49)        (57)
det(4,S) <= gen(5,S).                (56) reduces (51)        (58)
s(0,5).                              (57) reduces (46)        (59)
ans.                                 (59) reduces (33)        (60)


Note how subsumption is used to curtail the left recursion
of  rules  (21)  and  (22),  by  stopping  extraneous 
instantiation  steps  from  the  derived  clauses  (35)  and  (^). 
As  we  have  seen  in  the  example  of  the  previous  section, 
this mechanism is a general one, capable of handling
complex grammar symbols within certain constraints that
will be discussed later.
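The derivation above can be reproduced mechanically. The following Python sketch is one possible reading of the procedure described in Section 4, not the authors' implementation. Its assumptions: a clause is a tuple `("<=", head, literal, ...)`, variables are strings starting with a capital letter, the selected literal is always the leftmost one, the occurs check is omitted, and subsumption is simplified to an ordered instance check (an old clause blocks a new one if the new clause is an instance of it, literal by literal). All function names are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Most general unifier as a dict, or None (occurs check omitted)."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def subst(t, s):
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(subst(x, s) for x in t[1:])
    return t

def rename(t, f):
    # Apply f to every variable name in t.
    if isinstance(t, tuple):
        return (t[0],) + tuple(rename(x, f) for x in t[1:])
    return f(t) if is_var(t) else t

def canonical(clause):
    # Number variables in order of first occurrence, so variants become equal.
    names = {}
    return rename(clause, lambda v: names.setdefault(v, f"V{len(names)}"))

def subsumes(general, specific):
    # One-way matching: only variables of `general` may be bound.
    def match(p, t, s):
        if is_var(p):
            if p in s:
                return s if s[p] == t else None
            return {**s, p: t}
        if (isinstance(p, tuple) and isinstance(t, tuple)
                and len(p) == len(t) and p[0] == t[0]):
            for x, y in zip(p[1:], t[1:]):
                s = match(x, y, s)
                if s is None:
                    return None
            return s
        return s if p == t else None
    return match(general, specific, {}) is not None

def earley_deduction(program, goal):
    state, agenda = [], [canonical(goal)]
    while agenda:
        new = agenda.pop(0)
        if any(subsumes(old, new) for old in state):
            continue  # blocked inference step
        state.append(new)
        derived = []
        if len(new) > 2:  # nonunit clause; new[2] is its selected literal
            for c in program:
                c = rename(c, lambda v: v + "_p")
                s = unify(new[2], c[1], {})
                if s is None:
                    continue
                if len(c) > 2:  # instantiation
                    derived.append(subst(c, s))
                else:           # reduction with a program unit clause
                    derived.append(subst(new[:2] + new[3:], s))
            pairs = [(new, u) for u in state if len(u) == 2]
        else:
            pairs = [(c, new) for c in state if len(c) > 2]
        for c, u in pairs:      # reduction with a derived unit clause
            s = unify(c[2], rename(u, lambda v: v + "_u")[1], {})
            if s is not None:
                derived.append(subst(c[:2] + c[3:], s))
        agenda.extend(canonical(d) for d in derived)
    return state

GRAMMAR = [
    ("<=", ("s", "S0", "S"), ("np", "S0", "S1"), ("vp", "S1", "S")),
    ("<=", ("np", "S0", "S"), ("det", "S0", "S1"), ("n", "S1", "S")),
    ("<=", ("det", "S0", "S"), ("np", "S0", "S1"), ("gen", "S1", "S")),
    ("<=", ("det", "S0", "S"), ("art", "S0", "S")),
    ("<=", ("det", "S", "S")),
    ("<=", ("vp", "S0", "S"), ("v", "S0", "S1"), ("np", "S1", "S")),
]
WORDS = [("<=", ("n", 0, 1)), ("<=", ("gen", 1, 2)), ("<=", ("n", 2, 3)),
         ("<=", ("v", 3, 4)), ("<=", ("n", 4, 5))]
state = earley_deduction(GRAMMAR + WORDS, ("<=", ("ans",), ("s", 0, 5)))
print(("<=", ("ans",)) in state)
```

Because derived clauses are put in a canonical variable naming before the check, a left-recursive prediction that merely repeats an earlier clause is recognized as subsumed and dropped, which is what keeps the loop finite on this grammar. The order in which this sketch derives clauses need not match the numbering of the derivation in the text.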


The  Earley  Deduction  derivation  given  above 
corresponds  directly  to  the  chart  in  Figure  1. 


[Figure 1: Chart vs. Earley Deduction Proof]

Each arc in the chart is labeled with the number of a
clause in the proof. In each clause that corresponds to a
chart arc, two literal arguments correspond to the two
endpoints of the arc. These arguments have been
underlined in the derivation. Notice how the endpoint
arguments are the two string arguments in the head for
unit clauses (passive edges) but, in the case of nonunit
clauses (active edges), are the first string argument in the
head and the first in the leftmost literal in the body.

As we noted before, our view of parsing as deduction
makes it possible to derive general parsing mechanisms for
augmented phrase-structure grammars with gaps and
unbounded dependencies. It is difficult (especially in the
case of pure bottom-up parsing strategies) to augment
chart parsers to handle gaps and dependencies
(Thompson, 1981). However, if gaps and dependencies
are specified by extra predicate arguments in the clauses
that correspond to the rules, the general proof procedures
will handle those phenomena without further change.
This is the technique used in DCGs and is the basis of the
specialized extraposition grammar formalism (Pereira,
1981).

The increased generality of our approach in the area of
parsing strategy stems from the fact that chart parsing
strategies correspond to specialized proof procedures for
definite clauses with string arguments. In other words, the
origin of these proof procedures means that string
arguments are treated differently from other arguments,
as they correspond to the chart nodes.

In general, chart parsing cannot support strategies that
would create active edges by reducing the symbols in the
right-hand side of a rule in any arbitrary order. This is
because an active edge must correspond to a contiguous
sequence of analyzed symbols. Definite clause proof
procedures do not have this limitation. For example, it is
very simple to define a strategy, “head word parsing”²
(McCord, 1980), which would use the reduction rule to
infer

np(S0,S) <= det(S0,2) & rel(3,S).

from the clauses

np(S0,S) <= det(S0,S1) & n(S1,S2) & rel(S2,S).
[NP → Det N Rel]

n(2,3).
[There is an N between points 2 and 3 in the input]

This example shows that the class of parsing strategies
allowed in the deductive approach is broader than what is
possible in the chart parsing approach. It remains to be
shown which of those strategies will have practical
importance as well.
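The head word inference above is an ordinary reduction step applied to a literal other than the leftmost one. A minimal sketch, assuming flat literals `(predicate, arg, ...)`, a ground unit clause, and variables written as capitalized strings; the function name `reduce_at` is hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def reduce_at(head, body, i, unit):
    """Resolve body literal i against a ground unit clause; return (head, body) or None."""
    lit = body[i]
    if lit[0] != unit[0] or len(lit) != len(unit):
        return None
    binding = {}
    for arg, val in zip(lit[1:], unit[1:]):
        if is_var(arg):
            # A repeated variable must receive the same value each time.
            if binding.setdefault(arg, val) != val:
                return None
        elif arg != val:
            return None

    def apply(l):
        return (l[0],) + tuple(binding.get(a, a) for a in l[1:])

    return apply(head), [apply(l) for j, l in enumerate(body) if j != i]


head = ("np", "S0", "S")
body = [("det", "S0", "S1"), ("n", "S1", "S2"), ("rel", "S2", "S")]
print(reduce_at(head, body, 1, ("n", 2, 3)))
```

The resulting clause has two body literals whose string arguments are no longer contiguous, which is exactly the situation that has no counterpart among active edges.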

6.  Implementing  Earley  Deduction 

To  implement  Earley  Deduction  with  an  efficiency 
comparable,  say,  to  Prolog,  presents  some  challenging 
problems.  The  main  issues  are 

•  How  to  represent  the  derived  clauses,  especially  the 
substitutions  involved. 

•  How  to  avoid  the  very  heavy  computational  cost  of 
subsumption. 

•  How to recognize when derived clauses are no longer
needed and space can be recovered.

² This particular strategy (the “head word parsing” of
Section 5) could be implemented in a chart parser by
changing the rules for combining edges, but the generality
demonstrated here would be lost.

There  are  two  basic  methods  for  representing  derived 
clauses  in  resolution  systems:  the  more  direct  copying 
method,  in  which  substitutions  are  applied  explicitly;  the 
structure-sharing  method  of  Boyer  and  Moore,  which 
avoids  copying  by  representing  derived  clauses  implicitly 
with  the  aid  of  variable  binding  environments.  A 
promising  strategy  for  Earley  Deduction  might  be  to  use 
copying  for  derived  unit  clauses,  structure  sharing  for 
other  derived  clauses.  When  copying,  care  should  be 
taken  not  to  copy  variable-free  subterms,  but  to  copy  just 
pointers  to  those  subterms  instead. 

It is very costly to implement subsumption in its full
generality. To keep the cost within reasonable bounds, it
will be essential to index the derived clauses on at least
the predicate symbols they contain, and probably also
on symbols in certain key argument positions. A
simplification of full subsumption checking that would
appear adequate to block most redundant steps is to keep
track of selected literals that have been used exhaustively
to generate instantiation steps. If another selected literal
is an instance of one that has been exhaustively explored,
there is no need to consider using it as a candidate for
instantiation steps. Subsumption would then be only
applied to derived unit clauses.
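The simplification suggested in this paragraph can be sketched as a small table of explored selected literals, indexed on predicate symbol and arity. This is an illustration under assumptions, not a description of an existing system: literals are tuples `(predicate, arg, ...)`, variables are capitalized strings, and the names `instance_of` and `ExploredLiterals` are hypothetical.

```python
from collections import defaultdict


def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def instance_of(specific, general):
    """True if some substitution for the variables of `general` yields `specific`."""
    binding = {}

    def match(p, t):
        if is_var(p):
            return binding.setdefault(p, t) == t
        if isinstance(p, tuple) and isinstance(t, tuple):
            return len(p) == len(t) and p[0] == t[0] and all(
                match(x, y) for x, y in zip(p[1:], t[1:]))
        return p == t

    return match(general, specific)


class ExploredLiterals:
    def __init__(self):
        # Index on (predicate symbol, arity) to avoid hopeless comparisons.
        self.by_pred = defaultdict(list)

    def should_explore(self, lit):
        """Record lit and return True unless it is an instance of an explored literal."""
        key = (lit[0], len(lit) - 1)
        if any(instance_of(lit, old) for old in self.by_pred[key]):
            return False
        self.by_pred[key].append(lit)
        return True


table = ExploredLiterals()
print(table.should_explore(("np", 0, "S")), table.should_explore(("np", 0, 3)))
```

A finer index, for instance on the first string argument as well, would follow the suggestion in the text about key argument positions.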

A major efficiency problem with Earley Deduction is
that it is difficult to recognize situations in which derived
clauses are no longer needed and space can be reclaimed.
There is a marked contrast with purely top-down proof
procedures, such as Prolog, to which highly effective
space recovery techniques can be applied relatively easily.
The Earley algorithm pursues all possible parses in
parallel, indexed by string position. In principle, this
permits space to be recovered, as parsing progresses, by
deleting information relating to earlier string positions. It
may be possible to generalize this technique to Earley
Deduction, by recognizing, either automatically or
manually, certain special properties of the input clauses.

7.  Decidability  and  Computational 
Complexity 

It is not at all obvious that grammar formalisms based
on unification can be parsed within reasonable bounds of
time and space. In fact, unrestricted DCGs have Turing
machine power, and LFG, although decidable, seems
capable of encoding exponentially hard problems.
However, we need not give up our interest in the
complexity analysis of unification-based parsing. Whether
for interesting subclasses of grammars or specific
grammars of interest, it is still important to determine
how efficient parsing can be. A basic step in that direction
is to estimate the cost added by unification to the
operation of combining (reducing or expanding) a
nonterminal in a derivation with a nonterminal in a
grammar rule.

Because definite clauses are only semidecidable, general
proof procedures may not terminate for some sets of
definite clauses. However, the specialized proof
procedures we have derived from parsing algorithms are
stable: if a set of definite clauses G is the translation of a
context-free grammar, the procedure will always
terminate (in success or failure) when proving any start
goal for G. More interesting in this context is the notion
of strong stability, which depends on the following
notion of offline parsability. A DCG is offline-parsable
if its context-free skeleton is not infinitely ambiguous.
Using different terminology, Bresnan and Kaplan
(Bresnan and Kaplan, 1982) have shown that the parsing
problem for LFG is decidable because LFGs are offline
parsable. This result can be adapted easily to DCGs,
showing that the parsing problem for offline-parsable
DCGs is decidable. Strong stability can now be defined: a
parsing algorithm is strongly stable if it always terminates
for offline-parsable grammars. For example, a direct DCG
version of the Earley parsing algorithm is stable but not
strongly so.

In the following complexity arguments, we restrict
ourselves to offline-parsable grammars. This is a
reasonable restriction for two reasons: (i) since general
DCGs have Turing machine power, there is no useful
notion of computational complexity for the parser on its
own; (ii) there are good reasons to believe that
linguistically relevant grammars must be offline-parsable
(Bresnan and Kaplan, 1982).

In estimating the added complexity of doing online
unification, we start from the fact that the length of any
derivation of a terminal string in a finitely ambiguous
context-free grammar is linearly bounded by the length of
the terminal string. The proof of this fact is omitted for
lack of space, but can be found elsewhere (Pereira and
Warren, 1983).

General definite-clause proof procedures need to access
the values of variables (bindings) in derived clauses. The
structure-sharing method of representation makes the
time to access a variable binding at worst linear in the
length of the derivation. Furthermore, the number of
variables to be looked up in a derivation step is at worst
linear in the size of the derivation. Finally, the time (and
space) to finish a derivation step, once all the relevant
bindings are known, does not depend on the size of the
derivation. Therefore, using this method for parsing
offline-parsable grammars makes the time complexity of
each step at worst O(n^2) in the length of the input.

Some simplifications are possible that improve that time
bound. First, it is possible to use a value array
representation of bindings (Boyer and Moore, 1972) while
exploring any given derivation path, reducing to a
constant the variable lookup time at the cost of having to
save and restore O(n) variable bindings from the value
array each time the parsing procedure moves to explore a
different derivation path. Secondly, the unification cost
can be made independent of the derivation length, if we
forgo the occurs check that prevents a variable from
being bound to a term containing it. Finally, the
combination of structure sharing and copying suggested in
the last section eliminates the overhead of switching to a
different derivation path in the value array method at the
cost of a uniform O(log n) time to look up or create a
variable binding in a balanced binary tree.
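The role of the occurs check in the second simplification can be seen in a small sketch. It is illustrative only: terms are nested tuples `(functor, arg, ...)`, variables are capitalized strings, and the `occurs_check` parameter is a hypothetical switch introduced here. With the check enabled, every binding requires a traversal of the term being bound, and that traversal is the cost that grows with the size of the terms built during a derivation.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    # Traverse the whole of t looking for variable v.
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, x, s) for x in t[1:])


def unify(a, b, s, occurs_check=True):
    """Return a unifying substitution extending s, or None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    for v, t in ((a, b), (b, a)):
        if is_var(v):
            if occurs_check and occurs(v, t, s):
                return None  # v would be bound to a term containing it
            return {**s, v: t}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s, occurs_check)
            if s is None:
                return None
        return s
    return None


print(unify("X", ("f", "X"), {}, occurs_check=True))
print(unify("X", ("f", "X"), {}, occurs_check=False))
```

Without the check, the second call binds X to a term containing X, so anything that later expands bindings fully would have to guard against such cyclic bindings.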

When adding a new edge to the chart, a chart parser
must verify that no edge with the same label between the
same nodes is already present. In general DCG parsing
(and therefore in online parsing with any unification-based
formalism), we cannot check for the “same label”
(same lemma), because lemmas in general will contain
variables. We must instead check for subsumption of the
new lemma by some old lemma. The obvious
subsumption checking mechanism has an O(n^3) worst case
cost, but the improved binding representations described
above, together with the other special techniques
mentioned in the previous section, can be used to reduce
this cost in practice.

We  do  not  yet  have  a  full  complexity  comparison 
between  online  and  offline  parsing,  but  it  is  easy  to 
envisage  situations  in  which  the  number  of  edges  created 
by  an  online  algorithm  is  much  smaller  than  that  for  the 
corresponding  offline  algorithm,  whereas  the  cost  of 
applying  the  unification  constraints  is  the  same  for  both 
algorithms. 

8.  Conclusion 

We have outlined an approach to the problems of
parsing unification-based grammar formalisms that builds
on the relationship between parsing and definite-clause
deduction.

Several theoretical and practical problems remain.
Among these are the question of recognizing derived
clauses that are no longer useful in Earley-style parsing,
the design of restricted formalisms with a polynomial
bound on the number of distinct derived clauses, and
independent characterizations of the classes of
offline-parsable grammars and languages.